DOCUMENT RESUME

ED 254 321                                      PS 014 938

TITLE         Path-Referenced Assessment for Head Start Children.
              The Head Start Measures Project: Executive
              Summary.
INSTITUTION   Arizona Univ., Tucson. Center for Educational
              Evaluation and Measurement.
SPONS AGENCY  Administration for Children, Youth, and Families
              (DHHS), Washington, D.C.
PUB DATE      [month illegible] 84
CONTRACT      HHS-105-81-C-008
PUB TYPE      Reports - Research/Technical (143) -- Reports -
              Descriptive (141)

EDRS PRICE    MF01/PC03 Plus Postage.
DESCRIPTORS   *Achievement Tests; Early Childhood Education;
              General Science; Language Acquisition; Mathematics
              Skills; Measurement Techniques; *Preschool Tests;
              Pretesting; Reading Skills; Social Development;
              Spanish; *Test Construction; Test Interpretation;
              Test Items; Test Length; Test Reliability; Test
              Validity; Visual Perception
IDENTIFIERS   *Path Referenced Tests; *Project Head Start; Science
              Skills

ABSTRACT

The Head Start Measures Project was a 3-year study to
develop a set of measures designed specifically for Head Start
children. The measures are based on a path-referenced approach to
assessment, in which children's performance is described in terms of
their position along paths of development. A path is defined as a
sequence of skills within a content area that is ordered by
difficulty. A path-referenced test score not only indicates what the
child has achieved but also details the skills the child is likely to
master as developmental progress continues. The result of the project
is the Head Start Measures Battery (HSMB), consisting of six scales:
Language, Math, Nature and Science, Perception, Reading, and Social
Development. There are versions for Spanish-speaking and
English-speaking children. In a brief and nontechnical fashion, this
report summarizes aspects of the project described in more detail in
other publications. Chapter I describes the background of the Head
Start Measures Project and the path-referenced approach to
assessment. Chapter II describes the HSMB, the 1982-83 field test
results, and the uses of the measures. Chapter III describes the
development and evaluation process, the psychometric properties of
the measures, and the results of research relating program
characteristics to achievement. Chapter IV describes a current pilot
project, which involves the dissemination of the measures and their
use by a sample of 30 Head Start programs. (CB)



*******************************************************************
* Reproductions supplied by EDRS are the best that can be made    *
* from the original document.                                     *
*******************************************************************






Prepared for the Head Start Bureau o Administration for
Children, Youth and Families o Office of Human
Development Services o Department of Health and
Human Services

EXECUTIVE SUMMARY

The Head Start Measures Project

Path-Referenced Assessment
for Head Start Children

The Center for Educational Evaluation and Measurement
The University of Arizona o Tucson, Arizona

John R. Bergan, Project Director
Allen N. Smith, Government Project Officer
Contract No. 105-81-C-008







Prepared for the Head Start Bureau o Administration for
Children, Youth and Families o Office of Human Development
Services o U.S. Department of Health and Human Services

THE HEAD START MEASURES PROJECT

PATH-REFERENCED ASSESSMENT
FOR HEAD START CHILDREN



Jnu. 1984 



By The Center for Educational Evaluation and Measurement

The University of Arizona o Tucson, Arizona

John R. Bergan, Project Director

Allen N. Smith, Government Project Officer

Contract No. HHS-105-81-C-008



This Executive Summary was prepared pursuant to Contract
Number HHS-105-81-C-008. The statements and conclusions
contained herein are those of the Center for Educational
Evaluation and Measurement, University of Arizona, and
do not necessarily reflect the views of the sponsoring
agency.




Table of Contents

Chapter I: BACKGROUND AND APPROACH
    Introduction
    The Path-Referenced Approach to Assessment

Chapter II: OVERVIEW OF THE HSMB
    The Head Start Measures Battery
    The HSMB was Developed for Head Start Children
    Content of the HSMB
    Administration of the HSMB
    The HSMB is for English and Spanish Speaking Children
    Results of the 1982-83 Field Test
    Significant Psychometric Features
    Scoring and Reporting Services
    Scores and Profiles for the HSMB
    Using HSMB Results to Plan for Learning
    Implementing the HSMB

Chapter III: MEASURES DEVELOPMENT AND EVALUATION
    THE MEASURES DEVELOPMENT PROCESS
        Path-Referenced Measures Construction Technology
        Cultural and Linguistic Considerations
            in Test Construction
    FIELD TEST AND RESULTS
        The National Field Test Sample
        Data Collection
        Achievement Results
        Psychometric Properties of the HSMB
        Program Variables Associated With
            Achievement on the HSMB

Chapter IV: DISSEMINATION AND UTILIZATION PILOT STUDY
    Study Design
    Development and Production
    Participating Sites and Children
    Implementation
    Evaluation

REFERENCES



List of Tables and Figures

Ethnic Composition of National Field Test Sample

Age Ranges of Children in National Field Test
Sample

KR-20 Reliability Coefficients for the Six Scales
of the HSMB

Mean Number and Percent of Spanish Items Judged
Adequate by Reviewers


Math Scale Profile Report: Counting Subscale

Pre- and Posttest Path Scores on the Head Start
Measures Battery - National Field Test, 1982-83

Percentage of Teachers Who Reported Teaching Selected
Language Skills - Spring 1983

Percentage of Teachers Who Reported Teaching Selected
Math Skills - Spring 1983



BACKGROUND AND APPROACH



Introduction

The Head Start Measures Project was a three-year (1981-1984)
study funded by the Administration for Children, Youth
and Families (ACYF), Office of Human Development Services,
Department of Health and Human Services. The purpose
of the project was to develop a set of measures designed
specifically for Head Start children. ACYF felt that
a battery of measures was needed that would assist program
administrators in facilitating children's cognitive and
social development. A "path-referenced" approach to
assessment was created to measure children's growth in six
areas. The measures were field tested at three phases in their
development. In addition, data were collected on several
program characteristics in order to examine their
relationship to achievement on the measures. The result of the
project is the Head Start Measures Battery (HSMB) consisting
of six scales: Language, Math, Nature and Science,
Perception, Reading, and Social Development.

This report contains four chapters. Chapter I describes
the background of the Head Start Measures Project and
the path-referenced approach to assessment. Chapter II
describes the Head Start Measures Battery, the 1982-83
field test results, and the uses of the measures in a
relatively brief and nontechnical fashion. It is designed
for readers who are interested in an overview of the major
features of the assessment system. More detailed
information can be found in the HSMB Examiner's Manual. Chapter
III describes the development and evaluation process
during which the measures were created and refined, the
psychometric properties of the measures, and the results
of the research relating program characteristics to
achievement. More detailed information on the development and
evaluation process and the properties of the measures
can be found in the HSMB Technical Manual. Chapter IV
describes a pilot project that is currently in progress.
The project involves the dissemination of the measures
and their use by a sample of 30 Head Start programs.
A complete report on the project from the period of August,
1981 to December, 1983 can be found in Bergan et al. (1984),
Head Start Measures Battery Final Report.

The Path-Referenced Approach to Assessment

Early childhood educators have generally sought three
kinds of information from assessment programs: information
on the relative standing of children in norm groups,
information about the attainment of educational objectives,
and information about children's development associated
with educational experiences. Norm-referenced assessment
has provided information about relative standing.
Criterion-referenced assessment has afforded information
about the mastery of objectives. Unfortunately, until
recently there has been no adequate procedure for assessing
cognitive and social development. Consequently, Head
Start programs have had to rely heavily on norm-referenced
and criterion-referenced measures.

The HSMB was designed to provide information about
development for use in providing for children's learning
needs in Head Start programs. The measures are based
on a path-referenced approach to assessment (Bergan, 1981;
Bergan, Stone, & Feld, in press) that describes children's
performance in terms of their position along paths of
development. The path-referenced approach has several
key features that distinguish it from other approaches
to assessment: (1) A path-referenced measure assesses
a child's position in a validated developmental sequence.
(2) The child's path position reveals the specific skills
that the child has mastered and the skills the child will
need to master. (3) Progress is measured on a quantitative
scale which indicates the skills acquired during the course
of development.

A path is defined as a sequence of skills within a
content area that is ordered by difficulty. The sequence
of skills represents the developmental route, or path,
that children are likely to follow as they master increasingly
more difficult skills. Path sequences are empirically
validated. Validation of the HSMB was based on data from
over 1,000 Head Start children.



The fact that a path-referenced test is designed to
measure development implies assessment and educational
practices that are quite different from those appropriate
with other assessment technologies. A developmental
perspective on measurement calls for assessment devices that
identify not only what the child has accomplished, but
also what new learning challenges lie ahead.

The frequently used criterion-referenced strategy
of describing children's achievement in terms of the proportion
of skills they have mastered in a content area would not
assist the teacher to target classroom activities appropriate
for each individual child's learning needs, since objectives
based on such results may reflect skills well below a
child's current developmental level. Under these conditions,
knowing the child has mastered the established objectives
tells little about the developmental level of the skills
the child actually possesses and underestimates the
developmental level of the child. A path-referenced test score
indicates not only what the child has achieved, but also
details the skills the child is likely to master as
developmental progress occurs.

Path-referenced assessment links gains in achievement
directly to changes in developmental level reflecting
the acquisition of specific competencies. In contrast,
norm-referenced instruments describe progress in terms
of changes in relative standing in a norm group. Further,
the norm-referenced approach does not indicate the kinds
of skills associated with that change. Moreover, the
measure of change is not independent of the child being
assessed. For example, a six-month gain for a below average
child does not mean the same thing as a six-month gain
for an above average child (Linn, 1981).

By providing path scores that are referenced to specific
skills that are ordered to reflect developmental sequences,
the path-referenced approach describes children's competencies
in terms of their own past and future progress along a
path. Rather than comparing children to one another,
or to a norm group, path scores and profiles provide a
means for viewing progress of an individual child within
a developmental framework.



II

OVERVIEW OF THE HSMB



The Head Start Measures Battery

The Head Start Measures Battery (HSMB) is a set of
six path-referenced tests designed to assess the cognitive
and social development of children aged 3 through [illegible].
The Battery consists of six scales:

o Language
o Math
o Nature and Science
o Perception
o Reading
o Social Development

The HSMB was developed for the Head Start program through
funding provided by the Administration for Children, Youth
and Families. Development and evaluation were carried
out over a period of several years. The measures will
be available for use by Head Start in the fall of 1984.

The HSMB was Developed for Head Start Children

The HSMB was designed specifically for use by the Head
Start Program. In order to articulate the measures to
the goals and objectives of the Head Start Program, and
make them appropriate for assessing development, several
sources were called upon to determine the content of the
tests. First, groups of parents and Head Start staff
from every region of the country provided input regarding
important areas to assess. Extensive lists of objectives
were generated and refined. Second, the Head Start Performance
Standards were an important source for determining the
content of the measures. Third, early childhood curricula
were used in developing the measures. Fourth, the research
literature in child development was examined. The research
knowledge base made it possible to construct measures
in which items were selected to reflect developmental
sequences based on developmental theory. Fifth, linguistic
and cultural advisors were involved throughout the test
development effort. They reviewed items under development
for potential bias and reviewed the measures at each stage
of development. Sixth, national advisory panels comprised
of Head Start personnel and experts in child development
and assessment reviewed the measures development process
and the field test design.

The Head Start Measures Battery was administered during
field tests to national samples of Head Start children
from all regions of the country, representing several
ethnic groups including Blacks, Hispanics, Native Americans,
and Anglos. The purpose of the field tests was to obtain
information to be used to establish the validity and
reliability of the measures, and to improve the items.

Content of the HSMB 

Language Scale. The Language Scale assesses skills
necessary for effective communication. One major category
is Meaning in Language. The understanding of events in
a story is assessed, as is the ability to generate verbal
explanations.

A second assessment category contains items that assess
syntactic skills. Words must be sequenced to form intelligible
sentences. Word endings must be varied to reflect the
plural, the comparative, the possessive, and the subtleties
of meaning conveyed by verb tenses. Items tap each of
these abilities in English. In addition, several items
assess grammatical forms unique to Spanish.

The third category addresses skills for communicating
with others. This involves, among other things, the ability
to vary one's language based upon what one can assume
the listener knows. Another important skill assessed
is knowledge of the social rules which govern interactions
such as greetings, turn-taking, maintaining and changing
a topic, and asking questions.

Math Scale. The Math Scale focuses on children's skills
in computation and in number and measurement concepts.
The first category, Working With Numbers, includes recognizing
numerals, counting, adding, and subtracting. In addition,
items tap children's understanding of the fact that set
size remains the same regardless of the arrangement of
objects within it. The use of numbers to indicate position,
for example, first, second, third, is also assessed.

The second category, Working With Measurement Units,
assesses knowledge of the names and functions of measurement
units. Children are asked to compare the size or sequence
of length and time units.

Nature and Science Scale. There are two major categories
in the Nature and Science Scale. The first focuses on
content or subject matter, and the second on the processes
engaged in by scientists in gathering and organizing
information.

The following kinds of subject matter are assessed:
earth and universe, weather and seasons, time/space, mechanics,
motion, energy, substances, ecosystems, human health and
anatomy, life cycles, physiology, agriculture, goods and
services, transportation, and tools and construction.

The process category assesses skills in observation
of the physical and action characteristics of objects
and animals, and their classification based on these
characteristics. Another process assessed is making inferences
and explaining causal events. Knowledge of relationships,
recognition and construction of meaningful sequences,
and prediction of the outcome of an action are assessed
in a variety of items.

Perception Scale. The Perception Scale assesses children's
developing skills in recognizing shapes and in understanding
the spatial relationships among objects in the physical
world.

The Shape category requires constructing shapes from
parts and matching shapes which have been rotated.

The Relations category has three separate subcategories.
The first deals with objects in relation to each other.
Items require constructing a three-part display which
matches the model. A second subcategory assesses children's
ability to understand that alternative points of view
of a single object are possible. The third subcategory
involves matching or constructing a repeating pattern.
Items require building a matching pattern or completing
a repeating pattern.



Reading Scale. A variety of prereading skills are
assessed in the Reading Scale. They are divided into
two major categories: Visual Processing and Contextual
Processing.

Visual Processing includes such skills as letter naming,
choosing from a group of letters one which has already
been seen, identifying a corresponding upper or lower
case letter, rhyming, dividing words into their component
sounds, and recognizing that words (such as chalk-board)
may be composed of other shorter words.

Contextual Processing, on the other hand, requires that
the child utilize previous knowledge or experience rather
than rely primarily upon the printed text. Contextual
Processing tasks involve asking the child to recognize
his/her own name, and to fill in an appropriate word in
an incomplete sentence.

Social Development Scale. The Social Development Scale
assesses the child's knowledge of social relationships.
It is divided into three categories: Social Roles, Social
Rules, and Feelings.

The first, the Social Roles category, taps the child's
understanding of the requirements and expectations involved
in the roles of the leader, the buyer and the seller,
and the owner of both tangible and intangible property.

The Social Rules category assesses the child's understanding
of the concept of fairness, as it relates to the allocation
of rewards. The understanding of turn-taking, and of
helping others and sharing is also assessed.

The third category, Feelings, asks the child to identify
feelings from facial expressions and to predict the feelings
of the recipient of certain actions.

Pilot data were collected during the spring 198[illegible]
for two additional social development categories. The
first assesses self concept with respect to cognitive
skills and the second taps classroom social skills. Items
in these categories are being incorporated into the item
bank for the HSMB so that they may be used as needed in
future versions of the battery.



Administration of the HSMB

The Head Start Measures Battery was designed to be
easy to administer so that it can be administered by teachers.
It also has been administered successfully by many
paraprofessionals. Although prior testing experience is desirable,
it is not necessary, since most items employ a simple
1 or 0 scoring system. A training package consisting
of an Examiner's Manual and 1/2" video cassettes is available
for self-training. In addition, a Data Collection Training
Manual details procedures for training personnel to
administer the HSMB.

Each scale of the Head Start Measures Battery is
individually administered. Administration time ranges
from eight to twelve minutes depending on the scale and
the particular child being tested. Rules for starting
and stopping the administration of specified subscales
serve to shorten time as well as to avoid administration
of very difficult items to children at low achievement
levels and very easy items to children at high achievement
levels.

Sometimes young children have knowledge that they do
not demonstrate in a testing situation for a number of
reasons, including lack of rapport with the examiner,
or lack of understanding of what is required in the task
or of the language of testing. The HSMB outlines specific
procedures designed to maximize children's performance
on the Battery and make the testing experience an enjoyable
one. Many of the test items are preceded by demonstration
and practice items that serve to show children exactly
what is expected of them. Techniques for building and
maintaining rapport and providing encouragement are spelled
out. In field tests, children generally enjoyed the activities
in the Battery and were usually eager to "play games again"
with the examiner.

The HSMB is for English and Spanish Speaking Children

An important feature of the HSMB is that it allows
for linguistic diversity and thus yields more accurate
information about children's knowledge. When children
are tested in a language they do not fully comprehend,
they may miss items, not because they do not have the
knowledge but because they do not understand the examiner.
Testing a child in the appropriate language or languages
helps to ensure that results will provide a valid indication
of what the child knows.

Approximately twenty percent of the Head Start population
is Hispanic. While the majority of the Hispanic children
are of Mexican-American background, Puerto Ricans, Cubans,
and other Hispanic groups are also represented. One goal
of the measures development effort was to produce measures
capable of adequately assessing competencies for each
of these groups. To this end, the measures were developed
in both Spanish and English, and the Spanish versions
were constructed so that they would be appropriate for
speakers of different varieties of Spanish.

The Spanish and English versions of the measures are
both contained in the test manuals so that it is not necessary
to obtain a separate manual for each language. In most
cases Spanish and English items are equivalent; however,
there are certain areas of competence in one language
that do not exist in the other language. In such cases,
separate sets of items were developed. In the Language
Scale, for example, the English version contains a subtest
of grammatical structures that exist only in English and
the Spanish version contains a subset of grammatical structures
that exist only in Spanish.

The Examiner's Manual outlines specific procedures
for determining whether a child should be tested in English,
in Spanish, bilingually, or not at all. The procedures
are consistent with those recommended for use with the
Head Start bilingual curriculum models that have been
developed. The HSMB is not intended for use with children
whose dominant language is neither English nor Spanish
because valid results cannot be expected with children
who are not dominant in one of these two languages.

Results of the 1982-83 Field Test

Results of the field test demonstrate the sensitivity
of the HSMB to growth made by children who participated
in the Head Start Measures Project. Gains made between
pre- and posttesting are presented in Figure 2 in Chapter
III. The testing interval was approximately seven months
for most of the measures, but only about four months for
language. Data analysis indicated that the amount of
growth made by the children was educationally meaningful
and cannot be attributed solely to maturation. While
age was a factor related to gains, Head Start program
variables also played a role in determining the levels
of achievement reached by the spring.



Significant Psychometric Features

The psychometric properties of the spring 1984 version
of the HSMB are discussed in detail in Chapter III. The
present section briefly outlines particularly significant
psychometric features of the instruments. The psychometric
properties of the HSMB were established through the use
of classical item analysis procedures and newer latent
variable techniques described later in the summary.



o Reliability: In any assessment program, it is
  essential to know the extent to which a test score
  provides a consistent measure of a child's
  performance. Consistency is determined by calculating
  test reliability. KR-20 reliabilities ranging
  from .83 to .92 demonstrate the reliability of
  the scales of the HSMB.

o Item Discrimination: The extent to which items
  can discriminate among children of varying achievement
  levels is an important consideration in test
  construction. Latent trait procedures were used
  to calculate item discrimination values. Those
  items that are sensitive to different levels of
  ability were selected for the HSMB.

o Item Difficulty: Each scale of the HSMB contains
  items with a broad range of difficulty levels
  appropriate for children of different ages and
  developmental levels. Items were included only
  if they could be passed by children at one or
  more of the developmental levels for which the
  measures were designed.

o Item Information: Items vary in the amount of
  information that they provide about a child's
  ability. Test length can be kept to a minimum
  when items with high information values are used
  in test construction. The amount of information
  that items contributed to the total test score
  was one criterion for selecting them to be included
  in the final version of the HSMB.

o Item Bias: Three methods were used to predict
  and detect potential item bias: judgmental review,
  comparisons of difficulty values, and comparisons
  of item characteristic curves. Four groups were
  studied: Blacks, Spanish-dominant Hispanics,
  English-dominant Hispanics, and Anglos. Analyses
  showed few items to be potentially biased, and
  no one ethnic group was favored.

o Content Validity: Cultural advisors assessed
  the appropriateness of HSMB content for children
  from different ethnic backgrounds, resulting in
  the elimination of items with potentially biased
  content. Reviews by experts established the
  appropriateness of the content of each scale and the
  technical quality of the items. Additional evidence
  of content validity was provided by Head Start
  teachers. Results of the field test demonstrated
  that all of the items reflect skills taught in
  Head Start.

o Construct Validity: Several analyses were conducted
  that established the fact that the HSMB assesses
  development. Studies were conducted that empirically
  validated developmental sequences for each of
  the scales in the HSMB. These developmental sequences
  were based on research and theory in child development.
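
As an illustration of the reliability statistic cited under
Reliability above, the sketch below computes a KR-20 coefficient
from items scored 1 or 0. It uses the standard Kuder-Richardson
Formula 20 and is not the project's own analysis code. The
function name kr20 and the sample response matrix are
hypothetical, and the use of the population variance of total
scores is an assumed convention.

```python
def kr20(responses):
    """Kuder-Richardson Formula 20 for 0/1 item scores.

    responses: one list per child, each holding that child's
    0/1 scores on the same k items (hypothetical data layout).
    """
    n = len(responses)        # number of children
    k = len(responses[0])     # number of items
    if n < 2 or k < 2:
        raise ValueError("need at least two children and two items")

    # Total score per child and its population variance.
    totals = [sum(child) for child in responses]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    if var_total == 0:
        raise ValueError("total scores have zero variance")

    # Proportion passing each item (p) and the sum of p * (1 - p).
    p = [sum(child[i] for child in responses) / n for i in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)

    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Made-up responses for four children on three items.
sample = [
    [0, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 1],
]
coefficient = kr20(sample)
```

With only three items the coefficient for this toy sample is
modest. Longer scales with consistent items give higher values.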

Scoring and Reporting Services

o Score sheets are currently available for optical
  scanning at The University of Arizona's Center
  for Educational Evaluation and Measurement. Score
  sheets can be developed for hand scoring by individual
  grantees if needed.

o Reports are provided summarizing performance and
  developmental progress for individual children
  and classes. Reports make it easy to relate the
  child's developmental level to the level of classroom
  instruction. The teacher using the reports can
  plan learning activities based on the child's
  developmental level and can evaluate progress
  associated with Head Start experiences.

Scores and Profiles for the HSMB

A path score and path profile are provided for a child
on each of the six scales of the HSMB. The path score represents
the child's position on a developmental path. It describes
the child's overall level of performance on each scale
and, in addition, generates a profile of the child's level
of achievement on each subscale. The profile indicates
what types of learning activities are easy for the child,
what activities should be challenging to the child at
his or her current level of development, and what activities
are probably still too difficult for the child.

Figure 1 illustrates the Teaching and Skill Developmental
Profile Report for the Counting Subscale of the Math Scale.
Notice that the skills are ordered by difficulty. The
uppermost skill is the easiest and the lowest skill is
the most difficult. Note also that the path scores for
each skill are partitioned into three categories: nonmastery,
partial mastery, and mastery. The higher the path score,
the more likely the child is to be able to master increasingly
difficult skills. For example, a child with a path score
of 52 is considered to be a master of the first three
skills in the Counting Subscale, a partial master of the
next two skills, and a nonmaster of the last two skills.
On the other hand, a child with a path score of 67 is
able to accurately perform all the skills in this subscale.
Partial mastery can be interpreted to mean that the child
performs the skills in some situations but not in other
situations. The teacher may want to provide additional
learning opportunities with these skills in order to improve
the child's degree of mastery.
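
The partition of path scores into nonmastery, partial mastery,
and mastery can be sketched as follows. The skill labels and cut
scores are hypothetical values chosen only so that the two
examples in the text (path scores of 52 and 67) come out as
described. They are not the actual HSMB cut points, and the rule
of two cut scores per skill is an assumption.

```python
# Hypothetical (label, partial_cut, mastery_cut) entries for seven
# counting skills, ordered from easiest to most difficult.
# These numbers are illustrative only, NOT actual HSMB values.
SKILL_CUTS = [
    ("skill 1", 30, 38),
    ("skill 2", 36, 44),
    ("skill 3", 42, 50),
    ("skill 4", 47, 55),
    ("skill 5", 50, 58),
    ("skill 6", 55, 62),
    ("skill 7", 58, 66),
]


def classify(path_score, partial_cut, mastery_cut):
    """Place one path score into one of the three categories."""
    if path_score >= mastery_cut:
        return "mastery"
    if path_score >= partial_cut:
        return "partial mastery"
    return "nonmastery"


def profile(path_score):
    """Return the category for every skill, easiest first."""
    return [
        classify(path_score, partial_cut, mastery_cut)
        for _, partial_cut, mastery_cut in SKILL_CUTS
    ]


profile_52 = profile(52)  # three mastered, two partial, two not
profile_67 = profile(67)  # all seven mastered
```

Because the skills are ordered by difficulty, a single path score
yields a profile that moves from mastery through partial mastery
to nonmastery without skipping back.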



[Teaching and Skill Profile Report - Math Scale: the scanned
profile chart is not legible.]

Figure 1. Math Scale Profile Report: Counting Subscale



Using HSMB Results to Plan for Learning

The measures can be very useful to educational coordinators
and teachers in planning individualized and group
learning activities to enhance children's development. They
are also useful in assessing developmental progress and
in evaluating the appropriateness of activities. Fall
administration of the measures makes it possible for teachers
to base long-range plans on the entering skill levels
displayed by children. They are provided the necessary
information to avoid the pitfalls of targeting learning
experiences on skills that children already possess or
of focusing learning activities at developmental levels
too far above the current levels of functioning of the
children. Spring administration provides an indication
of the extent to which educational objectives have been
achieved and, when combined with information from the
fall, details developmental gains occurring during the
course of the year. This information is important for
subsequent educational planning. HSMB results can be
used to accomplish the following:

o Describe a child's current performance level.

o Create class and program profiles.

o Revise educational goals.

o Plan learning activities at appropriate levels.

o Determine for individuals and for classes how
much growth has been made over a year.

o Assess whether educational goals have been achieved.

A teacher Planning Guide is provided with the HSMB.
The Guide enables teachers to relate the level of teaching
planned for the child to the developmental level of the
child. By using the Planning Guide, teachers can determine
whether the planned and actual teaching levels are above,
approximately at, or below the child's developmental level.






Implementing the HSMB

A series of manuals accompanies the HSMB. The Technical
Manual describes the psychometric properties of the Battery.
The Examiner's Manual outlines procedures for administration
and scoring and explains how to interpret results. The
Data Collection Training Manual provides procedures for
training and monitoring.



III

MEASURES DEVELOPMENT AND EVALUATION



The Measures Development Process



As described briefly in Chapter II, several features
characterized the process of developing the HSMB. A fundamental
characteristic of the measures is that they are
based on child development theory and research. Such
theory and research provided information on the nature
of children's cognitive and social skills and the kinds
of changes expected in these skills during the course
of development. A second key feature of the measures
is that they were designed specifically for the culturally
diverse Head Start population. Systematic consideration
of cultural differences and the detection and elimination
of potentially biased items at several phases of measures
development were the procedures used to help ensure fair
assessment of children from varying backgrounds. Spanish
versions of the measures were developed so that Spanish-preferring
(or Spanish-dominant) children could be assessed.
Another feature of the measures is that the content of
the HSMB was geared toward the educational goals of Head
Start programs. Therefore, extensive and systematic procedures
were implemented early in the project to obtain input
from the Head Start community. This section describes
the path-referenced measures construction approach and
the methods employed to make the HSMB culturally appropriate.

Path-Referenced Measures Construction Technology

The development of path-referenced assessment instruments
for Head Start began with the construction of theoretical
models of the structure of competence in each of the six
areas targeted for assessment: Language, Math, Nature
and Science, Perception, Reading, and Social Development.
The theoretical models are referred to as "domain structure
models." The domain structure model provides an organized
theoretical framework for constructing a measure containing
items that adequately represent a content area and, in
addition, tap a range of levels of development. Each
scale of the HSMB can be thought of as providing a linear
sequence that quantifies a child's position on a "path"
of development. Two types of analyses were carried out
to create the domain structure models: (1) content category
analysis and (2) developmental structure analysis.

Content Category Analysis. Content category analysis
was used to identify the types of content included in
the scales. Each of the six broad areas of competence
was divided into categories. For example, the area of
mathematical knowledge was divided into working with numbers
and working with measurement units. These categories were
divided into subcategories. Working with numbers, for
example, was divided into two subcategories, computation
of numbers and identification of math symbols. This process
of subdivision continued until a set of task strands,
each including a number of tasks sharing a common goal
and involving an organized set of processes directed toward
attainment of that goal, was determined. For example,
addition tasks share a common goal, the summing of numbers.
Thus, addition tasks make up a task strand. Task strands
form the building blocks from which the developmentally
sequenced measures were constructed.

Developmental Structure Analysis. Following the content
analysis, an analysis of developmental structure was undertaken
to determine the sequence of skills within each of the
six content areas. The ordering of skills is based on
two factors that are specified and thus built into the
items that make up a particular measure: (1) variation
in the objects used to present tasks and (2) task demands.



Variations in the objects used in presenting a task
may affect the way in which a task is approached by the
child. In the case of addition, for some items the numbers
to be added are presented in the form of concrete objects
such as blocks. Under these conditions, the child could
add by counting all the blocks. In other items, however,
numbers are presented in the form of verbal symbols.
Under these circumstances, a child might add by relying
on verbal knowledge of addition facts. Thus, whether
a task involves blocks or verbal symbols contributes
to determining where in a sequence of skills the task
will fall.






Another way in which the sequence of tasks was determined
was through the specification of task demands. A task
demand is a characteristic of a task that affects the difficulty
of performing the task. Task demands affect the cognitive
processes necessary for task performance. For example,
the addition of two numbers with a sum less than ten requires
fewer steps, and hence fewer processes, than does an addition
task that requires carrying. Thus, the demand of carrying
influences the difficulty of the task because tasks that
have this demand require more steps, and hence more processes,
than do tasks that do not require carrying. In addition
to affecting complexity, task demands affect the types
of rules that are required for task performance. For example,
when children first learn to count, they may operate under
a rule that assumes that counting always begins with the
number 1. As development progresses, they may replace
this with a rule indicating that counting may start from
any number. The latter rule is more complex because it
replaces a single starting value with a whole class of
values. These methods of analysis were used to order
tasks within task strands, thus creating the developmental
structure that is an essential feature of the path-referenced
measures.

Cultural and Linguistic Considerations in Test Construction

A number of procedures were implemented to construct
a battery of measures appropriate for a culturally and
linguistically diverse population of children. In addition
to post hoc analyses of bias, the prevention and elimination
of potential bias was an integral part of the measures
development process.

Several procedures were utilized during the development
of items. First, procedures required the specification
of cultural considerations for each item. For example,
items were designed so that alternative methods of demonstrating
skills among different cultural groups would
be accepted, and cultural differences in item administration
procedures were also considered. Second, verbal responses
by the child were kept to a minimum on most of the scales,
and many items require non-verbal responses. Third, both
Spanish and English versions of the HSMB scales were developed
so that the measures could be administered to Spanish-speaking
children.






The development of the Spanish version of the HSMB
involved more than a simple translation of items from
English into Spanish. First, there are areas of content
that exist in one language but not in the other. In such
cases, items were developed separately for the Spanish
and English versions. Second, since Spanish-speaking
children in the United States speak several regional varieties
of Spanish, it was important to ensure that the Spanish
words and phrases used in the measures be comprehensible
to children from different regions. For Spanish terms
that differ by region, alternative forms are provided
in the measures. Systematic sociolinguistic procedures
were implemented during the field test to obtain information
on the adequacy of the Spanish versions, and the results
were used to revise the measures after both fall and spring
administrations.

In order to avoid potential cultural bias, cultural
bias reviews were carried out before item tryouts, after
the fall 1982 field test, and again after the spring 1983
field test. The reviewers assessed potential bias against
Blacks, Hispanics, and Native Americans. Item characteristics
assessed were content familiarity, stereotyping, relevance,
cultural meaning, offensiveness, and value assumptions.
Overall test characteristics assessed were visual and
name representation of minority groups, ethnicity of main
and secondary characters, and representation of activities
and cultural characteristics of various ethnic groups.
Results of the reviews were used to revise the measures
during the developmental process. In addition to the
judgmental reviews, statistical analyses of potential
item bias were carried out after the fall and spring field
tests, and the results were used to eliminate items on which
groups performed differently given the same overall ability level.






Field Test and Results

The national field test was conducted in the fall of
1982 and the spring of 1983 in order to try out the measures
on a large sample of Head Start children and to gather
data that could be used to establish the validity, reliability,
and other characteristics of the measures. Information
from the field tests was used to revise the measures after
both the fall and the spring administrations. This section
describes the sample of children who participated in the
field test, data collection procedures, the amount of
growth made by Head Start children between fall and spring
testing, the psychometric properties of the HSMB, and
results of the analyses of the relationship between selected
program variables and achievement.

The National Field Test Sample

Approximately 1400 children from 19 sites were tested
on the fall field-test version of the HSMB in the fall
of 1982. In the spring of 1983, approximately 1100 children
in 14 sites were tested on the spring field-test version.
Children in the sample represented four major regions
of the nation: the Farwest, Midwest, Southeast, and Northeast.
Three community types were represented: rural;
urban and suburban; and urban, suburban, and rural.



Table 1

Ethnic Composition of National Field Test Sample
Spring 1983

Ethnic Group                   Percent      Percent in Head
                               in Sample    Start Population

Black                             30              42
Hispanic, Spanish dominant        14 )
Hispanic, English dominant        19 )            20
Native American                   15               4
Anglo                             21              33
Other                              1               1
Total                            100             100

Table 1 presents the percent of children in the spring
1983 sample by ethnic group. Children were selected from
five major ethnic groups: Blacks, Spanish-dominant Hispanics,
English-dominant Hispanics, Native Americans, and Anglos.
The total percentage of Hispanic children in the sample
(33%) was larger than their percentage in the Head Start
population (20%). This type of sampling plan was necessary
in order to gather a sufficient amount of information
on the Spanish versions of the scales. The percentage
of Native Americans in the sample (15%) also exceeded
their representation in the population (4%). The larger
absolute numbers were needed in order to obtain information
on how Native American children respond to the measures.
The oversampling of Hispanic and Native American children
with respect to their representation in the population
necessitated the undersampling of Blacks and Anglos; however,
the absolute numbers of Blacks and Anglos were adequate
to gain the necessary information on their responses to
the measures.



Table 2

Age Ranges of Children in National Field Test Sample
Spring 1983

Age Range          Number of     Percent of
                   Children      Children

Level I
3-6 to 3-11            96            16
4-0 to 4-5            155            26
4-6 to 4-11           348            58
Total                 599           100

Level II
5-0 to 5-5            365            89
5-6 to 6-0             47            11
Total                 412           100

Total Tested         1011

Note. Children were 6 months younger when tested in the
fall of 1982.

Characteristics of Children. Table 2 presents the
number and percent of children in various age ranges for
Levels I and II of the spring 1983 field-test version
of the HSMB. The ages of the children in the spring 1983
sample ranged from 3-6 to 6-0 years. While age data are
only reported for the spring, it should be noted that
children were about six months younger for fall testing.
The younger children were all attending Head Start. During
the 1982-1983 field test, each scale was divided into
two levels. Level I was used with children ranging in
age from 3 years to 4 years and 11 months. Level II was
used with children over 5 years of age. Ninety-six percent
of the children who took Level I were Head Start children,
and 84% of those who took Level II were in Head Start.
Of the 599 Head Start children assessed on Level I, 58%
were between 4-6 and 4-11. At Level II, 89% of the children
assessed were between 5-0 and 5-5. Small numbers of elementary-school-aged
children were included in the sample
to make it possible for the HSMB to be used to document
the progress of Head Start children into elementary school.

Family Characteristics. Descriptive data were gathered
on families, classrooms, and teachers of the children
in Head Start classrooms. Data on family size, education,
income, and occupation were gathered both from families
and from Head Start program records.

o The mean number of family members for children in
the sample was 5;

o Approximately half the children were from two-parent
homes, and half were from single-parent homes;

o The mean annual income was $6,182;

o While there was some variation in the educational
level of mothers, 362 of the 855 sampled mothers
(42%) reported that the 12th grade was the highest
grade they had completed;

o The occupation reported for the largest single group
of mothers was service worker (59 of 151), while laborer
was the occupation most often reported for fathers
(85 of 257).

Staff Characteristics. A short Classroom Staff Questionnaire
was administered to the teachers of all Head Start
children in the sample to gather descriptive information
on education, training, and experience.

o The mean number of years of education for sampled
Head Start teachers was 13.8;

o The mean number of months of CDA training reported
by Head Start teachers was 10.34;

o 36% of the teachers reported having the CDA credential,
29% were working toward the CDA credential, and 66%
reported having had CDA training;

o 26% of the teachers had state teaching certificates;

o In general the teachers in the sample were experienced,
with a mean of 6.8 years of experience as a Head
Start teacher.

Class Characteristics. Information about the composition
of classrooms was gathered through direct observation.
The mean Head Start class size for children in the sample
was 17.16 children. There was, on the average, one teacher,
one aide, and one volunteer per classroom. In addition,
there was a mean number of 1.67 children with special
needs in the class, and the mean number of limited English
speaking children was 1.95. The mean number of limited
English speaking children per class ranged from 0 in several
sites to 12.64 in one site. Children with special needs
and children who were limited in English proficiency and
spoke a language other than Spanish were not assessed
in the field test.

Data Collection

Data Collection Management. Fall data collection was
carried out by Mediax Associates, a Connecticut-based
research firm. Spring data collection was implemented
by The University of Arizona. In both the fall and the
spring, data collection was supervised by a data collection
manager who supervised site managers working in the field.
On-site data collection was directed by site managers,
each of whom directed and monitored the activities of
from two to twelve data collectors. The group of site
managers included many university teaching staff and experts
in child development.






Site managers attended an intensive one-week training
session and were provided with a comprehensive manual
of instructions. They then interviewed, hired, and trained
data collectors, each of whom had to be officially designated
as adequately proficient in measures administration
before testing children.

Data Collector Qualifications. Data collectors who
carried out the testing were paraprofessionals who met
the following criteria:

o Familiarity with Head Start and its goals;

o Experience working with young children;

o A high school diploma;

o Proficiency in reading, speaking, and understanding
Spanish (at some sites).

Preparation for Measures Administration. The local
sample of children was selected with the cooperation of
the Head Start Director. Parent permission was obtained
for children chosen to participate in the project. Special
needs children identified by Head Start personnel as having
physical or mental handicapping conditions such that a
valid assessment could not be obtained were not included
in the field-test sample. In addition, children who were
limited in English proficiency and did not speak Spanish
were not assessed. The language(s) in which measures
were to be administered to each child was determined by
gathering information from parents, teachers, and aides
about children's language use patterns. A language preference
rating was assigned to each child, and the measures were
administered in English, in Spanish, or bilingually,
as designated. Under field test conditions the bilingual
examiners were permitted to switch languages if, in their
opinion, the child's performance indicated that the language
preference rating was in error and more valid results
were obtained using the other or both languages.

Administration of the Measures. Measures were administered
individually to each child in a location close to but
separate from the classroom. Examiners followed strict
guidelines about building rapport with the child, making
the experience pleasant, providing positive comments,
and stopping the administration if the child became tired
or disinterested.

Monitoring and Quality Control. The data collection
manager and supervisors monitored site managers in the
field. Site managers in turn monitored their data collectors
on a daily basis. Monitoring and quality control on site
involved: (1) using a Monitoring Form and Examiner Competency
Rating Form to monitor each measure; (2) shadow scoring
of each data collector on each measure to ensure 93% accuracy
in scoring; (3) reviewing all protocols for completeness
and correctness; and (4) maintaining communication to
identify and solve problems. Quality control procedures
implemented at the University of Arizona for incoming
data included visual inspection of all protocols for
completeness; appropriate test level for the child's age;
proper codes for site, center, class, and data collector;
administration time; and language of administration.
The use of a quality control computer program served to
flag data collectors with problems and to assess the quality
of the data files. File accuracy was maximized through
visual inspection and correction of errors.
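The shadow scoring check mentioned above can be sketched as a simple agreement calculation. The sketch assumes that "accuracy" means the proportion of items on which the data collector's score matches the monitor's independent shadow score; the report does not spell out the computation, and the function names below are hypothetical.

```python
from typing import Sequence


def scoring_agreement(collector: Sequence[int], shadow: Sequence[int]) -> float:
    """Proportion of items scored identically by collector and monitor."""
    if len(collector) != len(shadow) or len(collector) == 0:
        raise ValueError("both score lists must be non-empty and equal in length")
    matches = sum(1 for c, s in zip(collector, shadow) if c == s)
    return matches / len(collector)


def meets_criterion(collector: Sequence[int], shadow: Sequence[int],
                    criterion: float = 0.93) -> bool:
    """True when agreement reaches the criterion (93% in the field test)."""
    return scoring_agreement(collector, shadow) >= criterion


if __name__ == "__main__":
    collector_scores = [1] * 40            # collector scored all 40 items correct
    shadow_scores = [1] * 38 + [0] * 2     # monitor disagreed on 2 items
    print(scoring_agreement(collector_scores, shadow_scores))
    print(meets_criterion(collector_scores, shadow_scores))
```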

Achievement Results

The administration of the HSMB at two points in time
during the year served several purposes. First, it afforded
the opportunity to determine whether the HSMB is sensitive
to growth made by children as a result of participation
in Head Start programs. Second, it was necessary in order
to establish that the difficulty level of items was appropriate
for both the youngest children in the fall and the oldest
children in the spring. Third, data from fall and spring
administrations provide baseline data that could later
be used to determine whether larger gains result when
teachers use HSMB results to improve educational activities.
Results of the field test demonstrate the sensitivity
of the HSMB to growth made by children who participated
in the Head Start Measures Project. Gains made between
pre- and posttesting are presented in Figure 2. The testing
interval was approximately seven months for most of the
measures, but only about four months for language. Data
analysis indicated that the amount of growth made by the
children was educationally meaningful and cannot be attributed
solely to maturation. While age was a factor related
to gains, Head Start program variables also played a role
in determining the levels of achievement reached by the
spring.

Psychometric Properties of the HSMB

The psychometric properties of the HSMB were established
using both classical psychometric procedures and newer
latent trait techniques (Lord, 1980). Latent trait procedures
describe a child's competencies in a given content area
in terms of a continuous quantitative scale reflecting
the child's level of competence with respect to the ability
or trait being measured. Latent trait techniques are
ideally suited to the task of measuring development.
A child's score on a latent trait scale specifies the
child's level of development. The score indicates the
skills the child has mastered and those that must be learned
in order for the child to progress.

Latent trait techniques proved to be useful in a number
of ways in constructing the HSMB. In developing the battery,
it was necessary to construct scales for children from
three through six years of age. This required that items
appropriate for older children be placed on the same scale
as items appropriate for younger children. Latent trait
procedures were used with each scale in the battery to
place a pool of items for young children on the same scale
as a pool of items for older children.

The construction of the HSMB called for the generation
of scales that can be administered in a relatively short
time span. The test length required to accurately assess
achievement depends on the amount of information that
each of the items in the test is capable of conveying
about a child's achievement level. Latent trait technology
includes procedures for determining the amount of information
contained in test items. These procedures were used in
the construction of the HSMB to develop scales providing
a maximum amount of information about children's skills
with a minimum number of test items.
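The trade-off between item information and test length can be illustrated with a short sketch. It assumes a two-parameter logistic latent trait model, in which an item with discrimination a and difficulty b has information a^2 * P * (1 - P) at ability theta; the report does not state which latent trait model was used for the HSMB, and the item parameters below are invented. Ability here is on a standardized scale, not the path score metric.

```python
import math
from typing import List, Tuple

Item = Tuple[float, float]  # (discrimination a, difficulty b), hypothetical values


def p_correct(theta: float, a: float, b: float) -> float:
    """Two-parameter logistic probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta: float, a: float, b: float) -> float:
    """Information of one item at ability theta: a^2 * P * (1 - P)."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def scale_information(theta: float, items: List[Item]) -> float:
    """Information of a scale is the sum of its item informations."""
    return sum(item_information(theta, a, b) for a, b in items)


def standard_error(theta: float, items: List[Item]) -> float:
    """Approximate standard error of the ability estimate at theta."""
    return 1.0 / math.sqrt(scale_information(theta, items))


if __name__ == "__main__":
    short_scale = [(2.0, 0.0)] * 5    # 5 highly informative items
    long_scale = [(1.0, 0.0)] * 20    # 20 weakly informative items
    print(scale_information(0.0, short_scale), scale_information(0.0, long_scale))
    print(standard_error(0.0, short_scale), standard_error(0.0, long_scale))
```

Under these assumed parameters, five well-chosen items carry as much information at theta = 0 as twenty weaker ones, which is the sense in which a scale can be shortened without losing precision.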

In order to create an adaptive measurement system capable
of accommodating local program needs and a changing Head
Start program focus, it was necessary to construct the
HSMB scales in a manner that would allow old items to
be deleted and new items to be added to the existing scales.
Latent trait technology affords the capability to add
items to, or delete items from, an existing set of scales. This
feature of latent trait techniques was used extensively
in the refinement of the Battery and can be used in the
future to ensure that the Battery will remain up to date.

[Figure: pre- and posttest path scores for each scale, on
a vertical axis running from about 35 to 55. Scales and
sample sizes: Language (N not legible), Math (N = 904),
Nature and Science (N = 967), Perception (N = 687),
Reading (N = 697), Social Development (N = 888).]

Figure 2. Pre- and Posttest Path Scores on the Head Start
Measures Battery - National Field Test, 1982-83

Data documenting the psychometric properties of the
HSMB are presented in the HSMB Technical Manual. A summary
of psychometric analyses is presented in the following
paragraphs.

Item Difficulty. In order to accurately assess the
varying skill levels represented by the Head Start population,
a wide range of difficulty values is essential. For example,
if the measures contained only very easy items, they would
not be useful in assessing children with advanced skills.
Likewise, if the measures contained only very difficult
items, they would not be useful in assessing the competencies
of children in the early stages of development. All of
the scales in the HSMB contain an appropriate and wide
range of difficulty values, indicating that they provide
accurate assessment of the varying skill levels represented
in the Head Start population.

Item difficulty values for the Head Start Measures
Battery were calculated using latent trait estimation
techniques and classical test theory. In classical test
theory, item difficulty is defined as the proportion of
individuals who pass an item. When proportion correct
is used as an indicator of difficulty, difficulty varies
with examinee ability. When an item is given to high
ability examinees, a higher proportion will pass it than
would be the case if the item were given to low ability
examinees. Calibration on a high ability group would
therefore describe the item as being easier
than would be the case if the item were calibrated on
a lower ability group. Latent trait estimates of difficulty,
in contrast to classical estimates, do not vary as a function
of examinee ability. This was an advantage in constructing
the measures because it resulted in the construction of
an entire scale that is independent of examinee ability.

Item Discrimination. Item discrimination refers to
the extent to which an item is sensitive to the different
ability levels of children. In latent trait technology
item discrimination is defined in terms of the association
of changes in ability with changes in the probability
of a correct response. For example, a highly discriminating
item is one for which a small change in ability will produce
a measurable change in the probability of a correct response.
By contrast, an item that does not discriminate
well between ability levels is one for which a large change
in ability is required before a measurable change in response
probability occurs. Items were selected for the HSMB
that discriminated well for Head Start children at all
ability levels. The majority of the items in the HSMB
have high discrimination values, indicating that they are
sensitive to changes in ability.
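Both ideas, difficulty that does or does not depend on the examinees and discrimination as sensitivity to changes in ability, can be made concrete with a small sketch. It assumes a two-parameter logistic item characteristic curve with discrimination a and difficulty b; the report does not name the specific latent trait model used, and all parameter values and ability groups below are invented for illustration.

```python
import math
from typing import Sequence


def icc(theta: float, a: float, b: float) -> float:
    """Two-parameter logistic item characteristic curve."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def proportion_correct(abilities: Sequence[float], a: float, b: float) -> float:
    """Classical difficulty: expected proportion of a group passing the item."""
    return sum(icc(t, a, b) for t in abilities) / len(abilities)


def probability_change(a: float, b: float,
                       theta: float = 0.0, step: float = 0.25) -> float:
    """Change in P(correct) produced by a small increase in ability."""
    return icc(theta + step, a, b) - icc(theta, a, b)


if __name__ == "__main__":
    # One item (a = 1.2, b = 0.0) given to two hypothetical groups.
    low_group = [-1.5, -1.0, -0.5]
    high_group = [0.5, 1.0, 1.5]
    # The latent trait difficulty b is the same for both groups, but the
    # classical proportion correct differs with the ability of the group.
    print(proportion_correct(low_group, 1.2, 0.0))
    print(proportion_correct(high_group, 1.2, 0.0))

    # A highly discriminating item reacts more to the same ability change.
    print(probability_change(a=2.0, b=0.0))
    print(probability_change(a=0.4, b=0.0))
```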

Reliability. The reliability of the measures was assessed
using the Kuder-Richardson 20 coefficient. The K-R 20
provides a measure of internal consistency based on the
average intercorrelation among items. Internal consistency
is important in that one would have little faith in a
measure that yielded widely discrepant scores for different
subsets of items all assumed to be on the same scale.

The HSMB was found to have K-R 20 values ranging from
.83 to .92, indicating good levels of reliability. Table
3 displays the reliability coefficient for each of the
scales in the HSMB.
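For readers unfamiliar with the coefficient, the standard K-R 20 formula is k/(k-1) * (1 - sum(p*q) / variance of total scores), where k is the number of items, p is the proportion passing each item, and q = 1 - p. A minimal sketch follows; the response matrix is a made-up example, not HSMB data, and the function name kr20 is chosen here for illustration.

```python
from typing import List


def kr20(responses: List[List[int]]) -> float:
    """Kuder-Richardson 20 for a matrix of 0/1 item scores.

    Each row is one child; each column is one item.
    """
    n = len(responses)
    k = len(responses[0])
    if n < 2 or k < 2:
        raise ValueError("need at least two examinees and two items")

    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    # Population variance of total scores, matching the p*q item variances.
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    if var_total == 0:
        raise ValueError("total scores have zero variance")

    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n
        sum_pq += p * (1.0 - p)

    return (k / (k - 1)) * (1.0 - sum_pq / var_total)


if __name__ == "__main__":
    example = [
        [0, 0, 0],
        [1, 0, 0],
        [1, 1, 0],
        [1, 1, 1],
    ]
    print(kr20(example))
```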







Table 3

K-R 20 Reliability Coefficients
for the Six Scales of the HSMB

Scale                  Orig. No.    No. of Items on         K-R 20
                       of Items     Spring 1984 Version

Language                  79             --                  .90
Math                     114             59                  --
Nature & Science          84             51                  .83
Perception                --             25                  .83
Reading                   57             40                  .91
Social Development        68             40                  .84



Item Bias. Item bias studies were conducted after
the fall 1983 and spring 1984 field tests. Analyses were
conducted for four groups: (1) Anglos, (2) Blacks, (3)
English-dominant Hispanics, and (4) Spanish-dominant Hispanics.
The purpose of the analyses was to determine if,
after controlling for ability level, groups performed
differently on an item. The three techniques used to
determine if bias existed were

o judges' ratings,

o comparison of item characteristic curves (ICCs)
showing the relationship between the probability
of a correct response and child ability (Lord, 1980), and

o comparison of item difficulties.



Overall, the results of the investigation revealed
only a small amount of bias. Moreover, there were exceptionally
few instances in which the items on a scale favored
any group over other groups. In those instances, the
determination of bias indicated circumstances in which
one group responded differently to an item than another,
but was not necessarily put at a disadvantage by the differential
responding pattern. For example, some items were
found to be more difficult for a group when comparing
students at high ability levels but easier for the group
when comparing students at low ability levels. Differences
of this kind tended to occur most often with older children.
At each stage of measures revision, many potentially biased
items were eliminated from the Battery.
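The crossing pattern described above, in which an item is harder for a group at high ability but easier for it at low ability, is what one sees when two groups' item characteristic curves have different slopes. The sketch below assumes two-parameter logistic curves and invented parameters for a single item calibrated separately in two hypothetical groups; it is not the procedure documented in the Technical Manual.

```python
import math
from typing import Tuple

# Hypothetical (discrimination, difficulty) for one item in two groups.
GROUP_A: Tuple[float, float] = (1.5, 0.0)
GROUP_B: Tuple[float, float] = (0.6, 0.0)


def icc(theta: float, a: float, b: float) -> float:
    """Two-parameter logistic item characteristic curve."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def icc_gap(theta: float) -> float:
    """P(correct) for group B minus P(correct) for group A at equal ability."""
    return icc(theta, *GROUP_B) - icc(theta, *GROUP_A)


if __name__ == "__main__":
    for theta in (-1.0, 0.0, 1.0):
        # Positive gap: item is easier for group B at this ability level.
        # Negative gap: item is harder for group B at this ability level.
        print(theta, round(icc_gap(theta), 3))
```

With these assumed curves the gap changes sign as ability increases, so neither group is uniformly advantaged even though the item behaves differently in the two groups.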






Content Validity. In order to determine the potential
usefulness of the measures battery in Head Start, it was
necessary to address questions concerning the extent to
which the content of the measures reflected the competencies
that the measures were supposed to assess. The establishment
of content validity was addressed through reviews
of both the conceptual papers and the measures by experts
external to the project. Information from these reviews
was utilized during the measures revision process. Additional
evidence of content validity was provided by Head Start
teachers. Data obtained through use of the Planning Guide
indicated that all of the skills assessed on the HSMB
were taught by some proportion of the Head Start teachers
in the sample. In the area of language, for example,
about a third of the skills assessed on the Language Scale
were taught by between 61% and 100% of the teachers. Another
third were taught by between 41% and 60% of the teachers,
and the remaining third of the skills were taught by between
21% and 40% of the teachers. In the area of math, one
quarter of the skills assessed on the Math Scale were
taught by between 61% and 100% of the teachers. Another
quarter were taught by between 21% and 40% of the teachers,
and 15% of the skills were taught by between 41% and 60%
of the teachers. Figures 3 and 4 illustrate the percentage
of teachers who reported teaching selected skills in the
areas of math and language in the spring of 1983.

Construct Validity. Two types of construct validity
were established for the Head Start measures: validation
of the hierarchical developmental structure of each scale
and validation of the assumption that each scale in the
HSMB represents a unidimensional construct; that is, that
each scale measures a separate construct.

Validation of the developmental structures involved
application of latent class models (Bergan, 1983) to every






Figure 3. Percentage of teachers who reported teaching selected
language skills, Spring 1983. (Bar chart; horizontal axis:
Percentage of Teachers Teaching Skill, 0 to 100.)

Skill Description

 1. Take turns in a conversation
 2. Use appropriate farewell statement
 3. Tell short story - explain why something happened
 4. Use greeting appropriately
 5. Use correct form to describe size comparison
 6. Sequence 3 pictures to illustrate a story
 7. Explain something based on social rule
 8. Repeat sentence word for word - 1 descriptor
 9. State game's objective
10. Ask questions to learn about people
11. Take turns and maintain topic of conversation
12. Say larger of 2 groups contains more
13. Use appropriate greeting on phone
14. Repeat sentence word for word - 2 descriptors
15. Pluralize regular nouns appropriately
16. Identify self on phone
17. Recognize need for introductions
18. Label steps to be taken on path
19. Say largest of 3 groups contains most
20. Describe a turn in a path
21. Ask question on phone to find out something
22. Act out a sentence given in the passive
23. Correctly use irregular past tense form
24. Use regular possessive form appropriately
25. Act out sentence with 2 dependent clauses






Figure 4. Percentage of teachers who reported teaching selected
math skills, Spring 1983. (Bar chart; horizontal axis:
Percentage of Teachers Teaching Skill, 0 to 100.)

Skill Description

 1. Counting between 3 and 5 objects
 2. Counting out loud to a number between 6 and 10
 3. Tell which object is longer, shorter, etc.
 4. Tell which object is bigger, smaller, etc.
 5. Identify the number of objects in a small group
 6. Identify written numerals to 5
 7. Counting out loud to a number between 11 and 20
 8. Match numerals up to 5 with groups of objects
 9. Tell how many in a small set after taking some
10. Identify the position of an object in a row
11. Identify written numerals up to 20
12. Judge = sets as ≠ after adding to one set
13. Judge 2 short rows of equal length as equal
14. Judge 2 long rows of equal length as equal
15. Judge ≠ sets as = after taking from one set
16. Judge 2 equal length rows of unequal no. as ≠
17. Judge 2 short unequal length rows of = no. as =
18. Judge 2 long unequal length rows of = no. as =
19. Counting up to 10 from a number between 2 and 5
20. Tell how many in a large set after taking some
21. In story - add small sets showing how many in all
22. Adding two small sets of objects
23. Judge ≠ sets = after adding unequally to both
24. Adding two large sets of objects
25. In story - add large sets showing how many in all






measure to determine the ordering among tasks. Two kinds
of order were examined. One involved ordering tasks by
difficulty from easy to hard. The second involved prerequisite
ordering. In prerequisite ordering, easier tasks are
necessary to the mastery of harder tasks. Both types
of ordering occur in developmental sequences. The results
of the latent class analyses for the Head Start measures
indicated that in the majority of instances the hypothesized
sequencing of skills was confirmed by the data.
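
The difference between difficulty ordering and prerequisite
ordering can be illustrated with a short Python sketch. It is
not the latent class analysis used in the project; it only
cross-tabulates pass/fail outcomes on two tasks. The function
name and the scores are hypothetical and serve only as an
illustration.

```python
from collections import Counter


def ordering_summary(easy, hard):
    """Summarize how two pass/fail tasks are ordered (1 = pass, 0 = fail)."""
    if len(easy) != len(hard) or not easy:
        raise ValueError("both tasks need one score per child")
    n = len(easy)
    cells = Counter(zip(easy, hard))  # counts for each (easy, hard) outcome
    return {
        # Difficulty ordering: more children pass the easier task.
        "difficulty_order_holds": sum(easy) / n > sum(hard) / n,
        # Prerequisite ordering: passing the harder task while failing
        # the easier one should be rare or absent.
        "violations": cells[(0, 1)],
        "violation_rate": cells[(0, 1)] / n,
    }


# Hypothetical scores for ten children on two tasks.
easy_task = [1, 1, 1, 1, 1, 1, 0, 0, 1, 1]
hard_task = [1, 1, 0, 0, 1, 0, 0, 0, 0, 1]
summary = ordering_summary(easy_task, hard_task)
print(summary)
```

A pair of tasks can be ordered by difficulty without being
ordered by prerequisite: the first check looks only at pass
rates, while the second looks at whether any child passes the
harder task without the easier one.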

Construct validity questions related to the uniqueness
of each scale involved the assumption that each of
the content areas targeted for assessment, i.e., Language,
Math, Nature and Science, Perception, Reading, and Social
Development, reflects a separate path of development.
This assumption was investigated using confirmatory factor
analysis (Joreskog & Sorbom, 1979). As in latent class
analysis, confirmatory factor analysis involves comparison
among models. The results of the confirmatory factor
analysis indicated that a model which assumed that each
of the six measures in the battery would reflect a separate
factor was preferred over other models examined. This
model was congruent with the hypotheses underlying measures
construction.

Criterion-related Validity. It is useful to assess
the extent to which the HSMB relates to other measures
of achievement used with young children. Other existing
measures may be thought of as criterion variables to which
performance on the HSMB should be related. Thus, evidence
of relationships between the Head Start measures and other
existing assessment devices helps to establish the
criterion-related validity of the instruments.

In order to establish criterion-related validity, a
sample of the children who received the spring 1983 versions
of the HSMB were administered the Metropolitan Readiness
Test (MRT) and the Preschool Inventory (PSI). The sample
sizes ranged from 56 to 109 children. The Language, Math,
Nature and Science, Perception, Reading, and Social Development
Scales correlated .10, .39*, .38*, .17, .27*, and .10
with MRT scaled scores respectively; and .50*, .66*, .66*,
.41*, .71*, and .6?* with PSI scores respectively (asterisks
denote significant correlations). The low correlations
of some of the HSMB scales with the MRT may be related
to the lack of relationship between the MRT and the Head
Start curriculum. The correlations of the scales with
the PSI were quite high considering that the PSI is a
global measure of achievement and that the HSMB scales
measure more specific content areas. These correlations
are important since there is evidence that the PSI does
predict later school achievement for preschool children.
The findings should nevertheless be considered tentative
since the sample sizes were small.
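
The dependence of significance on sample size can be seen in
a brief Python sketch. It computes a Pearson correlation and
the usual t statistic for testing it against zero. The report
does not state which test was applied, so the choice of
statistic is an assumption, and the function names and scores
below are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical paired scores for eight children.
hsmb = [12, 15, 9, 20, 17, 11, 14, 18]
psi = [30, 34, 25, 41, 36, 29, 31, 40]
r = pearson_r(hsmb, psi)
print(round(r, 2), round(t_statistic(r, len(hsmb)), 2))

# The same correlation yields a larger t statistic in a larger sample.
print(t_statistic(0.27, 56) < t_statistic(0.27, 109))
```

With samples as small as those reported here, only moderately
large correlations reach significance, which is one reason the
findings are described as tentative.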

Sociolinguistic Validity. Sociolinguistic validation
studies were conducted to assess the adequacy of the Spanish
version of the measures for children speaking different
varieties of Spanish. After the fall and spring field tests,
a group of reviewers was selected who were experienced
with the measures, highly proficient in Spanish and English,
and were speakers of the major varieties of Spanish found
in the Head Start population. Each item in the Spanish
version of the measures was expected to meet the following
criteria: (1) the Spanish should sound natural; (2) the
language should not be above the level of the children's
language, i.e., not too formal or adult-like; (3) the
Spanish and English versions of those items intended to
be parallel should be equivalent in meaning; and (4) the
variety of Spanish used should be understandable to children
speaking different varieties of Spanish. The reviewers
rated each item as adequate or inadequate on these criteria.

Table 4 presents a summary of their ratings. Results
of the analyses of their ratings indicated that the vast
majority of the items were judged to be adequate. When
an item was found to be inadequate in some respect, reviewers
supplied suggestions for revisions. Revisions were made
based on their input as well as the input of an expert
in Spanish sociolinguistics.



Table 4

Mean Number and Percent of Spanish Items
Judged Adequate by Reviewers

                     Natural    Child-Level   Equivalent   Variety
                     Sounding   Language      Meaning      Understandable

N                    547        523           527          549
%                    98.7       94.4          99.8         99.0
Total Items Rated    554        554           528*         554

*Items not parallel in Spanish and English were not rated.



Program Variables Associated with Achievement on the HSMB

Part of the validation of the HSMB included an examination
of the influence of Head Start educational program variables
on achievement assessed by means of the HSMB. Three major
types of instructional program variables were examined: (1)
classroom variables, (2) policy variables, and (3) background
variables. Classroom variables are characteristics of
a classroom that are directly controlled by the teacher
in the classroom and thought to be highly related to
achievement. Policy variables are program and classroom
characteristics and resources that are assumed to have an indirect
effect on learning by influencing the activities that
take place in the classroom. Policy variables are amenable
to alteration by administrators but are not generally
subject to direct manipulation by classroom teachers and
aides. Background variables are characteristics of children
and their families.

Learning Opportunities. The first classroom variable
examined was the amount of time devoted to providing learning
opportunities. Teachers participating in the project
were asked to indicate the amount of time in their daily
schedules that they spent providing teacher initiated
learning opportunities within specific content areas.
The amount of time allocated to such learning opportunities
was found to be related to student achievement in the
areas of math, nature and science, social development,
and perception. This means that teachers who spent more
time providing learning opportunities had students who
demonstrated higher levels of achievement on the HSMB
by the end of the program.

Teacher Knowledge of Children's Skills. The amount
of knowledge that teachers have about what their students
know and don't know is related to achievement. Each teacher
involved in the project used a Planning Guide to indicate
for each child those skills that had been mastered, those
skills that had not been mastered, and those skills that
had not been taught. This information was compared with
the actual performance of each child on the HSMB in order
to obtain an index of "teacher knowledge of children's
skills." The results indicated that the knowledge variable
was related to achievement on all six scales in the Battery.
This means that teachers who were sensitive to a child's
level of skill in a given content area had a greater impact
on child achievement than teachers who lacked sensitivity
to skill level. This finding supports the view that a
teacher who knows what skills a child possesses and what
skills a child does not possess is in a better position
to help the child than a teacher who does not have that
kind of information.
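
One way such an index could be formed is sketched below in
Python. The report does not give the formula that was used,
so the sketch assumes a simple agreement rate: the proportion
of rated skills on which the teacher's judgment matches the
child's test result. Skills marked "not taught" are left out,
which is also an assumption. All names and data are
hypothetical.

```python
def knowledge_index(teacher_ratings, test_results):
    """Proportion of rated skills where teacher judgment matches the test.

    teacher_ratings maps skill -> "mastered", "not mastered", or "not taught".
    test_results maps skill -> True if the child passed the skill on the test.
    """
    rated = [
        skill
        for skill, rating in teacher_ratings.items()
        if rating in ("mastered", "not mastered") and skill in test_results
    ]
    if not rated:
        raise ValueError("no skills were both rated and tested")
    agreements = sum(
        (teacher_ratings[skill] == "mastered") == test_results[skill]
        for skill in rated
    )
    return agreements / len(rated)


# Hypothetical Planning Guide entries and test outcomes for one child.
ratings = {
    "count to 5": "mastered",
    "identify numerals": "not mastered",
    "add small sets": "mastered",
    "judge equal rows": "not taught",
}
results = {
    "count to 5": True,
    "identify numerals": True,
    "add small sets": True,
    "judge equal rows": False,
}
index = knowledge_index(ratings, results)
print(round(index, 2))
```

Under this definition a teacher whose judgments always agree
with the test results receives an index of 1, and lower values
indicate less accurate knowledge of what the child can do.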



Individualization. Data obtained from the Planning
Guide were used to determine level of difficulty of the
skills teachers reported they were teaching to each child.
Latent trait estimates of item difficulty were used to
code the difficulty levels of each task reflected in the
Planning Guide. An instructional difficulty index was
computed by averaging the difficulty levels for the skills
taught to each child. The difficulty level of skills
that were taught was calculated for individual children
as well as for classes as a whole. In many cases the
standard deviation for the class was zero. This indicates
that all children in such classes were being provided
with the same learning opportunities. In Reading, Level
I skills, for example, 80% of the teachers reported teaching
the same skills to all children, while 20% provided for
some individualization. For Reading, Level II skills,
85% reported teaching the same skills to all children
and the other 15% reported some individualization.
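
The computation described above can be sketched in Python as
follows. The difficulty values, skill labels, and function
names are hypothetical, and the use of the population standard
deviation is an assumption, since the report does not say
which form was used.

```python
from statistics import mean, pstdev


def instructional_difficulty(skills_taught, difficulty):
    """Average difficulty of the skills taught to one child."""
    return mean(difficulty[skill] for skill in skills_taught)


def class_summary(class_plan, difficulty):
    """Per-child index plus the class mean and standard deviation."""
    per_child = {
        child: instructional_difficulty(skills, difficulty)
        for child, skills in class_plan.items()
    }
    values = list(per_child.values())
    return per_child, mean(values), pstdev(values)


# Hypothetical latent trait difficulty estimates for three skills.
difficulty = {"A": -1.0, "B": 0.0, "C": 1.0}

# Every child is taught the same skills: the class SD is zero.
uniform_class = {"child1": ["A", "B"], "child2": ["A", "B"], "child3": ["A", "B"]}
_, uniform_mean, uniform_sd = class_summary(uniform_class, difficulty)

# Children are taught different skills: the class SD is above zero.
varied_class = {"child1": ["A"], "child2": ["B", "C"]}
_, varied_mean, varied_sd = class_summary(varied_class, difficulty)

print(uniform_mean, uniform_sd)
print(varied_mean, varied_sd)
```

A class standard deviation of zero arises only when every
child's index is identical, which is why it was read as a sign
that no individualization was taking place.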

The policy variables examined in the project fall
into three broad categories: program exposure, classroom
composition, and teacher training and qualifications.
Each of these is described below.

Program Exposure. Program exposure was defined as
the length of the instructional day, the length of the
instructional year, and the number of days during the
instructional year that each child had actually been in
attendance. Data for these three exposure variables were
obtained from attendance forms and teachers' schedules.

The policy variable that was found to be consistently
associated with achievement on the HSMB was program exposure
expressed as total days present at Head Start. The number
of days that each child had been in attendance was related
to achievement on the Language, Math, Nature and Science,
and Perception Scales. This means that those children
who were rarely absent showed higher levels of achievement
on these four measures by the end of the school year than
those children who were absent often. Reading was again
one of the Scales for which no effect was found; however,
the lack of emphasis on teaching reading skills provides
a plausible explanation for this finding. The other Scale
for which an effect was not present was the Social Development
Scale. There is no apparent reason for the observed
lack of effect for this scale.

No significant relationship was found between the length
of the day and performance on the HSMB nor between the
length of the year and performance on the HSMB. A plausible
explanation for the lack of observed relationship between
the length of the day and achievement may have to do with
when learning opportunities are provided to students.
The number of learning opportunities was found to be an
important predictor of higher levels of achievement on
the HSMB, but it may be that there is not a great deal
of difference between the amount of teaching that occurs
in half-day and whole-day programs. A possible explanation
for the lack of significant relationship between performance
on the HSMB and length of the year is that there was not
a great deal of variability in the length of the year
across programs. There may have been more variation in
the number of days a child was present (not absent) than
in the total days of the program.

Classroom Composition. Information on class size and
staff/child ratio was gathered in order to examine the
relationship of these variables to performance on the
HSMB. Classroom observations were conducted at four different
times during the year. Means were obtained across the
four observations. The mean class size was found to be
about 17 with one teacher and one aide present on the
average. No significant relationship was found between
class size and performance on the measures, nor was a
relationship found between staff/child ratio and achievement.
Although class size has been found to be related to achievement
in other studies (see, for example, Smith & Spence, 1980),
the most plausible explanation for the lack of relationship
in this study seems to be the lack of variability in class
size and staff/child ratio among the classrooms in the
sample. This issue is again being examined in the spring
1984 pilot study described in Chapter IV.

Teacher Characteristics. Information on teacher
qualifications examined in the Head Start Measures Project included
amount of education, degrees and certifications obtained,
and amount of prior experience in teaching. Data on these
variables were obtained from a classroom staff questionnaire.
Teaching experience was defined in terms of the number
of years of prior teaching activity including experience
as a Head Start teacher, a Head Start aide, a Head Start
volunteer, a preschool teacher, a preschool aide, public
school teacher, private/parochial school teacher, home
visitor, and day care classroom staff member. Information
on teacher training included amount of CDA training and
amount of Head Start inservice training during the current
year. CDA training was calculated in months and inservice
training was given in days.

Few relationships were found between teacher characteristics
and spring achievement levels. Amount of Head Start inservice
training (measured in days) was related to spring achievement
in Language and Reading, while amount of CDA training
(calculated in months) and possession of a CDA credential
were related to spring achievement on the Perception Scale.
The general lack of relationship between teacher training
and performance on the HSMB may be due to the fact that
data were not gathered on training related to specific
content areas. The relationship between training in content
areas related to the HSMB and children's achievement on
the HSMB is being examined in the spring 1984 pilot
implementation study.

Background Variables. A number of background variables
were examined along with policy and classroom instructional
variables in the determination of program effects. Age,
recorded in months, was included as an index of growth
that could be used to separate noninstructional from
instructional influences on development (Bryk, Strenio, & Weisberg,
1980). Other background variables included socioeconomic
variables and family background variables shown to affect
achievement (Bergan & Henderson, 1979). The socioeconomic
variables included the primary family provider's occupation
and income. These variables were obtained from a family
background data sheet. The provider's occupation was
recorded using Duncan's SEI scale. The family variables
involved the mother's education and the number of siblings
and parents in the family. This information was obtained
from a family questionnaire.

Of those background variables examined in the Head
Start Measures Project, the only variable that was found
to be related to achievement was age. Based on the child-level
status score analysis, age was significantly related to
achievement on all six scales. Age influences spring
achievement in two ways. It affects spring achievement
indirectly by influencing fall achievement which in turn
affects spring performance. In addition, it has a direct
influence on spring achievement. This finding shows that
measures performance is influenced by developmental factors
associated with age as well as by participation in Head
Start.
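
The two routes by which age influences spring achievement can
be illustrated with the standard decomposition used in path
analysis, sketched here in Python. The report does not give
the coefficients or say that this exact computation was
performed, so the values and the function name are
hypothetical.

```python
def decompose_effect(age_to_fall, fall_to_spring, age_to_spring):
    """Split the effect of age on spring achievement into two routes.

    The indirect effect is the product of the two paths that run
    through fall achievement; the total is direct plus indirect.
    """
    indirect = age_to_fall * fall_to_spring
    return {
        "direct": age_to_spring,
        "indirect": indirect,
        "total": age_to_spring + indirect,
    }


# Hypothetical standardized path coefficients.
effects = decompose_effect(age_to_fall=0.5, fall_to_spring=0.6, age_to_spring=0.2)
print(effects)
```

In this illustration most of the influence of age runs through
fall achievement, with a smaller share operating directly on
spring achievement.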



DISSEMINATION AND UTILIZATION PILOT STUDY



In February of 1984 a six-month project was initiated
to pilot the dissemination and use of the Head Start Measures
Battery prior to a broad scale implementation planned
for the fall of 1984. The purpose of the project is to
develop and evaluate training materials, produce the HSMB
test manuals and manipulable materials, train Head Start
personnel in measures use and interpretation, assess measures
administration feasibility, and analyze data on child
performance and program variables. This chapter describes
the design of the study, the development of materials
and computer programs, the sample of sites involved, the
training activities, data collection procedures, and the
planned evaluation of the dissemination effort.

Study Design

Two implementation strategies are being examined: (1)
the Head Start Staff Model and (2) the Site Manager Model.
Thirty programs were selected to participate and were
assigned to one of the two models. Under the Head Start
Staff Model, the HSMB was implemented by Head Start staff.
An Educational Coordinator from each program was first
trained in the HSMB system. He or she then selected testers
from among existing Head Start staff, trained them, and
supervised the testing. Under the Site Manager Model,
an outside person was hired by the Head Start program
to manage the testing activities. The Site Manager selected
and hired testers, trained them, and supervised data
collection. Late in the spring the Educational Coordinators
from the Site Manager Model sites were invited to training
sessions so that they could become familiar with the
path-referenced assessment system.

All testing was completed by early June and HSMB data,
as well as data on selected program variables, were sent
to the University of Arizona for processing. A series
of six meetings was scheduled for late July to discuss
the interpretation of results of the HSMB and how results
might be used to improve the educational component in
Head Start programs and classrooms. A final report will
be available in August and will present an evaluation
of the training materials, the training process, the data
collection procedures, and a description of dissemination
obstacles and successes encountered under each of the
two models.

Development and Production

The six scales of the HSMB have been developed and 
continually refined since 1981. The pilot phase involved 
an additional set of revisions and the professional printing 
of 125 copies of the measures. 

Four manuals were produced in the winter of 1984.
The Examiner's Manual describes the content of the HSMB,
specifies administration procedures, and explains how
to interpret scores. The Data Collection Training Manual
was developed for use by Educational Coordinators (and
Site Managers) in training data collectors to administer
the measures. It contains a suggested training plan and
all the information necessary to adequately monitor the
testing process. The Technical Manual describes the
psychometric properties of the measures including validity,
reliability, and item information. It also describes
procedures employed to avoid and eliminate linguistic
and cultural bias. A manual for use in interpreting test
results and using the information to evaluate and improve
the educational component is also being developed based
on feedback from Head Start personnel participating in
the pilot study.

In addition to the manuals, a series of 1/2" video
cassette tapes was developed and produced for use in
training sessions. They include a training module
demonstrating administration procedures for each of the six
scales of the HSMB.

The development phase of the study also involves the
writing of computer programs for use in scoring the tests
and producing a series of reports for both individual
students and classes. Computer programs have also been
written for item banking. The item banking technology
allows items to be placed on a common scale. This is necessary
for a measures system to be adaptable to local program
needs.
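
A minimal Python sketch of how items from a new calibration
might be placed on the scale of an existing item bank is given
below. It assumes a simple mean shift based on items common to
both calibrations. The report does not describe the linking
method actually programmed, so the method, the item labels,
and the difficulty values are all assumptions made for
illustration.

```python
from statistics import mean


def link_to_bank(bank, new_calibration):
    """Shift new difficulty estimates onto the bank scale.

    Items present in both calibrations act as anchors. The shift is
    the difference between their mean difficulty on the bank scale
    and their mean difficulty in the new calibration.
    """
    anchors = sorted(set(bank) & set(new_calibration))
    if not anchors:
        raise ValueError("at least one common item is required")
    shift = mean(bank[i] for i in anchors) - mean(new_calibration[i] for i in anchors)
    return {item: value + shift for item, value in new_calibration.items()}


# Hypothetical difficulty estimates; L02 and L03 appear in both sets.
bank = {"L01": -1.0, "L02": 0.0, "L03": 1.0}
new_calibration = {"L02": 0.5, "L03": 1.5, "L04": 2.0}
linked = link_to_bank(bank, new_calibration)
print(linked)
```

Once a new item such as the hypothetical L04 is expressed on
the bank scale, it can be combined with existing items to form
scales suited to a local program while scores remain
comparable.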

Participating Sites and Children

Twenty-eight Head Start programs participated in the
dissemination and utilization pilot study. Participating
programs were selected from among programs that volunteered
and that were willing to commit the time and resources
necessary to carry out the pilot. The 28 programs represented
all of the ten Head Start regions. They also represented
a range of program sizes in both urban and rural settings.
Twelve of the sites have Spanish-speaking children. It
was considered important to include a sufficient number
of Spanish-speaking children so that procedures for determining
whether to administer tests in English, Spanish, or bilingually
could be assessed. Programs were assigned, randomly in
most cases, to either the Head Start Staff Model or the Site
Manager Model. Within each site 30 children were selected
at random to be tested.
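
The random selection of children within a site can be sketched
in Python as follows. The roster, the seed, and the function
name are hypothetical; the report does not describe the
mechanism that sites used to draw their samples.

```python
import random


def select_children(roster, k=30, seed=None):
    """Draw k children at random, without replacement, from a site roster."""
    children = list(roster)
    if len(children) <= k:
        # A site with k or fewer children simply tests everyone.
        return children
    rng = random.Random(seed)
    return rng.sample(children, k)


# Hypothetical roster of 120 enrolled children at one site.
roster = [f"child_{i:03d}" for i in range(120)]
selected = select_children(roster, k=30, seed=1984)
print(len(selected))
```

Fixing the seed makes a draw reproducible, which would allow a
site's sample to be checked later.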

Implementation

Following a week long training session, Educational
Coordinators and Site Managers selected and trained data
collectors. They set up testing schedules and supervised
the process of data collection to ensure that procedures
for proper administration and scoring were followed.
In addition to administering the HSMB, each participating
program collected information on selected program variables
including: class schedules, attendance, classroom composition,
teacher characteristics, and family income and occupation.
Planning Guides were filled out by participating teachers.
For each child assessed with the HSMB, the teachers indicated
in the Planning Guide what skills had been taught to the
child, what skills the teacher felt had been mastered
by the child, and what skills had not yet been mastered.
Data collection activities were completed by early June
and data were sent to the University of Arizona for processing.






The evaluation is currently in progress and the results
will be provided in a separate report. The central purpose
of the evaluation activities is to determine the effectiveness
of the training, the effectiveness of each of the manuals
comprising the measures package, and of the implementation
of the HSMB system under conditions in which measures
administration was directed by local Head Start personnel
and under conditions in which measures administration
was directed by outside personnel. Another major purpose
is to assess the administrative feasibility of the measures
and to make changes as appropriate. A subordinate purpose
is to provide additional information on program effects
and measures characteristics identified in the measures
development phase of the project. Evaluation activities
for the current phase fall into four categories: evaluation
of training, evaluation of system implementation, evaluation
of program effects, and evaluation of measures characteristics.

Evaluation of training will focus both on the quality
of the training experience and on the effects of training
on project participants. Evaluation data were collected
in the training workshops through the use of questionnaires.
The evaluation will provide feedback for refining training
materials and procedures prior to their dissemination
on a broader scale.

The evaluation of training effects was carried out
to determine the extent to which participants gained the
necessary skills to implement the HSMB in local Head Starts.
Data on assessment skills were gathered using shadow scoring
procedures and a data monitoring form both during training
and on site.

The evaluation of system implementation is being carried
out to determine the extent to which the skills acquired
during training are implemented by participants in local
Head Starts. The evaluation of system implementation
will focus on training in local sites, data collection,
the interpretation of results of testing, and the use
of the results of the measures for improving educational
plans. A variety of instruments and procedures are being
used to assess system implementation. Site Managers and
Educational Coordinators used the shadow scoring and monitoring
methods and forms to determine the adequacy of data collection.
Quality control procedures designed during the measures
development phase of the project were implemented at the
University of Arizona to serve as a further check on the
adequacy of data collection. The administration of the
Planning Guide was evaluated by T/TA providers. Interpretation
skills and planning skills will be evaluated from data
obtained during group meetings held in July.

Evaluation planned for the project includes the examination
of program effects using test performance data, Planning
Guides, and program variable instruments. The program
variable instruments will provide information that can
be used to reexamine the relationship between program
characteristics and performance on the HSMB.

The administration of the measures during the spring
of 1984 affords the opportunity to update information
on their psychometric characteristics. Moreover, the
examination of the properties of the measures affords
a means for testing the implementation of the item banking
technology being developed for the project. Examination
of item statistics including item difficulty and discrimination
will be particularly important. The properties of test
items may change over time. In order to maintain items
that can be used to form scales accurately reflecting
children's abilities, it is useful to update item statistics
whenever the measures are administered on a broad scale.
Latent trait techniques employed during measures development
will be used to update information on item characteristics
during this phase of the project.
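
As a simple illustration of the two item statistics named
above, the Python sketch below computes classical indices from
a table of right/wrong responses. These classical indices are
a stand-in chosen for brevity; they are not the latent trait
estimates used in the project. Difficulty is taken as the
proportion of children answering correctly, and discrimination
as the correlation between the item and the score on the
remaining items. The response data and function names are
hypothetical.

```python
import math


def item_difficulty(item_scores):
    """Proportion of children answering the item correctly."""
    return sum(item_scores) / len(item_scores)


def item_discrimination(item_scores, rest_scores):
    """Correlation between a 0/1 item and the score on the other items."""
    n = len(item_scores)
    mean_x = sum(item_scores) / n
    mean_y = sum(rest_scores) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(item_scores, rest_scores))
    sxx = sum((x - mean_x) ** 2 for x in item_scores)
    syy = sum((y - mean_y) ** 2 for y in rest_scores)
    if sxx == 0 or syy == 0:
        raise ValueError("item and rest scores must both vary")
    return sxy / math.sqrt(sxx * syy)


def item_statistics(responses):
    """Return (difficulty, discrimination) for each item.

    responses holds one list of 0/1 item scores per child.
    """
    stats = []
    for j in range(len(responses[0])):
        item = [row[j] for row in responses]
        rest = [sum(row) - row[j] for row in responses]
        stats.append((item_difficulty(item), item_discrimination(item, rest)))
    return stats


# Hypothetical responses of six children to three items.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
]
stats = item_statistics(responses)
for difficulty, discrimination in stats:
    print(round(difficulty, 2), round(discrimination, 2))
```

Recomputing such statistics after each broad administration
shows whether an item has become easier or harder, or has
stopped separating more able from less able children.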

Although results are not yet available from the pilot
study, informal feedback has already been used to revise
the battery. Each of the six scales has been shortened.
Items that were least relevant to the Head Start Program
were deleted. In addition, the Examiner's Manual was
simplified. These changes will facilitate the use of
the HSMB in Head Start programs in the fall of 1984.






References 



Bergan, J.R. (1983). Latent-class models in educational
research. In E.W. Gordon (Ed.), Review of Research
in Education, Washington, D.C.: American Educational
Research Association, 10, 305-360.

Bergan, J.R. (1981). Path referenced assessment in school
psychology. In T.R. Kratochwill (Ed.), Advances in
School Psychology. (Vol. 1). New Jersey: Lawrence
Erlbaum Associates.

Bergan, J.R., Stone, C.A., & Feld, J.K. (1984). Rule
replacement in the development of basic number skills.
Journal of Educational Psychology, 289-299.

Bergan, J.R., Anderson, D.O., Feld, J.K., Henderson, R.W.,
Johnson, D.M., Lane, S., Mott, S., Parra, E., Robinson,
L., Stone, C.A., & Swarner, J. (1984). Head Start
Measures Final Report. Prepared for the Department
of Health & Human Services under Contract No.
HHS-105-81-C-008.

Grotberg (1969). Review of Research: 1965-1969. Washington,
D.C.: Project Head Start, U.S. Office of Economic
Opportunity.

Hambleton, R.K. & Murray, L.M. (1982). Some goodness
of fit investigations for item response models. Amherst,
MA: University of Massachusetts, Laboratory of Psychometric
and Evaluative Research Report, No. 129.

Head Start Program Performance Standards. (1975). Office
of Child Development, Department of Health, Education,
and Welfare.

Joreskog, K.G., & Sorbom, D. (1979). Advances in factor
analysis and structural equation models. Cambridge,
Mass: Abt.

Lord, F.M. (1980). Applications of item response theory
to practical testing problems. Hillsdale, N.J.: Lawrence
Erlbaum Associates, Inc.

Smith, A. & Spence, C. (1980). National day care study:
Optimizing the day care environment. American Journal
of Orthopsychiatry, 50 (4), 718-722.






Project Director:
John R. Bergan                    University of Arizona

Government Project Officer:
Allen N. Smith                    Administration for Children,
                                  Youth and Families

Project Managers:                 University of Arizona
Jason K. Feld
Donna M. Johnson
Kathleen Silvers
Clement A. Stone

Authors of the HSMB:

Language Scale:                   University of Arizona
Joyce C. Swarner
John R. Bergan
Stacey E. Mott
Adam Schnaps
Teresa K. Anderson
Elena Parra

Math Scale:                       University of Arizona
John R. Bergan
Jason K. Feld
Clement A. Stone
Adam Schnaps
Joyce C. Swarner
Teresa K. Anderson
Elena Parra

Nature and Science Scale:
Ronald W. Henderson               University of California,
                                  Santa Cruz

Perception Scale:
Rosemary A. Rosser                University of Arizona

Reading Scale:
Joy Hestiad                       University of California,
                                  Santa Cruz

Social Development Scale:
Sadie Grimmett                    Indiana University



For more information contact:

Dr. Allen N. Smith
Administration for Children, Youth and Families
400 Sixth Street, S.W., Room 5143
Donohoe Bldg.
Washington, D.C. 20013
(202) 755-7724

Center for Educational Evaluation and Measurement
College of Education
The University of Arizona
Tucson, Arizona 85721

(602) 621-^85^ • ^ 






